Antoine Beaupré: (Still) working too much on the computer
I have been using Workrave to try to force myself to step away from
the computer regularly to work around Repetitive Strain Injury
(RSI) issues that have plagued my life on the computer intermittently
in the last decade.
Workrave itself is only marginally effective at getting me away from
the machine: like any warning system, it suffers from alarm fatigue
as you frenetically click the dismiss button every time a Workrave
warning pops up. However, it has other uses.
Analyzing data input
In the past, I have used Workrave to document how
I work too much on the computer,
but never went through more serious processing of the vast data store
that Workrave accumulates about mouse movements and
keystrokes. Interested in knowing how much my leave from Koumbit
affected time spent on the computer, I decided to look into this
again.
It turns out I am working as much, if not more, on the computer since
I took that "time off":
We can see here that I type a lot on the computer. Normal days range
from 10 000 to 60 000 keystrokes, with extremes at around 100 000
keystrokes per day. The average seems to fluctuate around 30 000 to
40 000 keystrokes per day, but rises sharply around the end of the
second quarter of this very year. For those unfamiliar with the
underlying technology, one keystroke is roughly one byte I put on the
computer. So the average of 40 000 keystrokes is 40 kilobytes (KB)
per day on the computer. That means 15 MB over a year or about
150 MB (or roughly 140 MiB if you want to be picky about it) over the
course of the last decade.
That is a lot of typing.
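The arithmetic above can be checked with a few lines of Python. This is only a sketch: it takes the 40 000 keystrokes per day average and the "one keystroke is roughly one byte" approximation from the text at face value, and the names used are made up for illustration. The unrounded decade total comes out slightly under 150 MB because 14.6 MB per year was rounded up to 15 MB.

```python
# Assumptions taken from the text: one keystroke is roughly one byte,
# and the long-run average is 40 000 keystrokes per day.
KEYSTROKES_PER_DAY = 40_000
DAYS_PER_YEAR = 365


def typed_bytes(days, keystrokes_per_day=KEYSTROKES_PER_DAY):
    """Return the approximate number of bytes typed over `days` days."""
    return days * keystrokes_per_day


per_year = typed_bytes(DAYS_PER_YEAR)
per_decade = typed_bytes(10 * DAYS_PER_YEAR)

# MB uses powers of ten (10**6), MiB uses powers of two (2**20).
print(f"per year:   {per_year / 1e6:.1f} MB")
print(f"per decade: {per_decade / 1e6:.0f} MB = {per_decade / 2**20:.0f} MiB")
```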
I originally thought this could simply be because I type more now,
whereas I previously used the mouse more. Unfortunately, Workrave
also tracks general "active time", which we can also examine:
Here we see that I work around 4 hours a day continuously on the
computer. That is active time, not just the time between login and
logout. In other words, the time when I look away from the computer
and think for a while, jot down notes in my paper agenda or otherwise
step away from the computer for small breaks is not counted here.
Notice how some days go up to 12 hours and how recently the average
went up to 7 hours of continuous activity.
So we can clearly see that I basically work more on the computer now than
I ever did in the last 7 years. This is a problem: one of the reasons
for this time off was to step away from the computer, and it seems I
have failed.
Update: it turns out the graph was skewed towards the last
samples. I went easier on the keyboard in the last few days and
things have significantly improved:
Another interesting thing we can see is when I switched from using my
laptop to using the server as my main workstation, around early 2011,
which is about the time marcos was built. Now that marcos has been
turned into a home cinema downstairs, I went back to using my laptop
as my main computing device, in late 2015. We can also clearly see
when I stopped using Koumbit machines near the end of 2015 as well.
Further improvements and struggle for meaning
The details of how the graph was produced are explained at the end of
this article.
This is all quite clunky: it doesn't help that the Workrave
data structure is hard to parse and therefore easy to corrupt. It
would be best if each data point was on its own separate line, which
would be longer, granted, but much easier to parse.
Furthermore, I do not like the perl/awk/gnuplot data processing
pipeline much. It doesn't allow me to do interesting analysis like
averages, means and linear regressions easily. It could be interesting
to rewrite the tools in Python to allow better graphs and easier data
analysis, using the tools I learned in
2015-09-28-fun-with-batteries.
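As a rough idea of what such a Python rewrite could look like, here is a minimal sketch that computes an average and a least-squares trend line from per-day lines of the form "YYYYMMDD value ...". The function names and the sample data are invented for illustration, and the column numbering (1-based, as in gnuplot's "using") is an assumption about how one would want to call it, not something the existing scripts provide.

```python
from datetime import datetime
from statistics import mean


def parse_series(lines, column):
    """Parse 'YYYYMMDD v1 v2 ...' lines into (ordinal_day, value) pairs.

    `column` is 1-based like gnuplot's `using`, so column 1 is the date.
    """
    points = []
    for line in lines:
        fields = line.split()
        if len(fields) < column:
            continue  # skip empty or truncated lines
        day = datetime.strptime(fields[0][:8], "%Y%m%d").toordinal()
        points.append((day, float(fields[column - 1])))
    return points


def linear_fit(points):
    """Ordinary least-squares fit; returns (slope, intercept).

    Needs at least two distinct days, otherwise the slope is undefined.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up sample data: a date followed by a single value column.
sample = ["20160601 30000", "20160602 32000", "20160603 34000"]
points = parse_series(sample, column=2)
average = mean(y for _, y in points)
slope, intercept = linear_fit(points)
print(f"average: {average:.0f}, trend: {slope:+.0f} per day")
```

The slope is expressed in units per day, so a positive value means the daily count is trending upwards over the period covered by the input.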
Finally, this looks only at keystrokes and non-idle activity. It could
be more interesting to look at idle/active times and generally the
amount of time spent on the computer each day. And while it is
interesting to know that, at 40 KB a day, I basically write a small
book on the computer every three days (according to Wikipedia, 120KB
is about the size of a small pocket book), it is mostly meaningless
if all that stuff is machine-readable code.
Where is, after all, the meaning in all those shell commands and
programs we constantly input on our keyboards, in the grand scheme of
human existence? Most of those bytes are bound to be destroyed by
garbage collection (my shell's history) or catastrophic backup
failures.
While the creative works from the 16th century can still be accessed
and used by others, the data in some software programs from the
1990s is already inaccessible. - Lawrence Lessig
But is my shell history relevant? Looking back at old posts on this
blog, one has to wonder if the battery life of the Thinkpad 380z
laptop or how much e-waste I threw away in 2005 will be of any
historical value in 20 years, assuming the data survives that long.
How this graph was made
I was happy to find that Workrave has some contrib scripts to do such
processing. Unfortunately, those scripts are not shipped with the
Debian package, so I requested that this be fixed
(#825982). There were also some fixes necessary to make
the script work at all: first, there was a syntax error in the Perl
script. But then since my data is so old, there was bound to be some
data corruption in there: incomplete entries or just plain broken
data. I had lines that were all NULL characters, typical of power
failures or disk corruption. So I have made a patch to fix that
script (#826021).
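To illustrate the kind of corruption check involved, here is a small Python sketch that drops lines containing NUL bytes as well as empty lines. It is not the patch from #826021, only an illustration of the idea; the function name and the sample records are placeholders and do not reflect the real Workrave file format.

```python
def clean_lines(raw_lines):
    """Yield decoded lines that pass a basic corruption check."""
    for raw in raw_lines:
        if b"\x00" in raw:
            continue  # NUL padding, typically left by a crash or disk error
        text = raw.decode("ascii", errors="replace").strip()
        if text:  # also drop lines that are empty after stripping
            yield text


# Placeholder records: one good line, one line of NULs, one empty line.
sample = [b"record 1 2 3\n", b"\x00" * 32 + b"\n", b"\n"]
print(list(clean_lines(sample)))
```

Reading the file in binary mode and decoding afterwards keeps the check independent of whatever garbage ended up in the damaged lines.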
But this wasn't enough: while this processes data on the current
machine fine, it doesn't deal with multiple machines very well. In the
last 7 years of data I could find, I was using 3 different machines:
this server (marcos), my laptop (angela) and Koumbit's office
servers (koumbit). I ended up modifying the contrib scripts to be
able to collate that data meaningfully. First, I copied over the data
from Koumbit into a local fake-koumbit directory. Second, I mounted
marcos's home directory locally with SSHFS:
sshfs anarc.at:/home/anarcat marcos
I also made this script to sum up datasets:
#!/usr/bin/perl -w
use List::MoreUtils 'pairwise';
# unbuffered output
$| = 1;
# date => array of per-column totals
my %data = ();
while (<>) {
    my @fields = split;
    my $date = shift @fields;
    if (defined($data{$date})) {
        # date already seen: add the new columns to the existing totals
        my @t = pairwise { $a + $b } @{$data{$date}}, @fields;
        $data{$date} = \@t;
    } else {
        $data{$date} = \@fields;
    }
}
foreach my $d ( sort keys %data ) {
    print "$d @{$data{$d}}\n";
}
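For comparison, the same per-date summing could be sketched in Python along these lines. This is not part of the original tooling: the function name and sample lines are made up, and it assumes every column after the date is an integer. Missing columns are treated as zero here, which is a choice made for this sketch.

```python
from itertools import zip_longest


def sum_datasets(lines):
    """Sum the numeric columns of lines sharing the same leading date."""
    totals = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        date, values = fields[0], [int(v) for v in fields[1:]]
        if date in totals:
            # Column-wise addition; a shorter row is padded with zeros.
            totals[date] = [
                a + b for a, b in zip_longest(totals[date], values, fillvalue=0)
            ]
        else:
            totals[date] = values
    return [
        f"{date} {' '.join(str(v) for v in totals[date])}"
        for date in sorted(totals)
    ]


# Made-up input: two machines reporting on the same day, one on another.
sample = ["20160601 1 2", "20160602 5 5", "20160601 10 20"]
print("\n".join(sum_datasets(sample)))
```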
Then I made a modified version of the Gnuplot script that processes
all those files together:
#!/usr/bin/gnuplot
set title "Workrave"
set ylabel "Keystrokes per day"
set timefmt "%Y%m%d%H%M"
#set xrange [450000000:*]
set format x "%Y-%m-%d"
set xtics rotate
set xdata time
set terminal svg
set output "workrave.svg"
plot "workrave-angela.dat" using 1:28 title "angela", \
"workrave-marcos.dat" using 1:28 title "marcos", \
"workrave-koumbit.dat" using 1:28 title "koumbit", \
"workrave-sum.dat" using 1:2 smooth sbezier linewidth 3 title "average"
#plot "workrave-angela.dat" using 1:28 smooth sbezier title "angela", \
# "workrave-marcos.dat" using 1:28 smooth sbezier title "marcos", \
# "workrave-koumbit.dat" using 1:28 smooth sbezier title "koumbit"
And finally, I made a small shell script to glue this all together:
#!/bin/sh
perl workrave-dump > workrave-$(hostname).dat
HOME=$HOME/marcos perl workrave-dump > workrave-marcos.dat
HOME=$PWD/fake-koumbit perl workrave-dump > workrave-koumbit.dat
# remove idle days as they skew the average
sed -i '/ 0$/d' workrave-*.dat
# per-day granularity
sed -i 's/^\(........\)....\? /\1 /' workrave-*.dat
# sum up all graphs
cat workrave-*.dat | sort | perl sum.pl > workrave.dat
./gnuplot-workrave-anarcat
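The two sed steps in that script can be hard to read at a glance, so here is a Python sketch of what they appear to do: drop lines whose last field is 0, then cut the leading timestamp down to its first eight characters (the day). The function name and sample lines are hypothetical; the regular expression mirrors the sed pattern, which accepts three or four characters after the date.

```python
import re


def to_daily(lines):
    """Mimic the two sed passes: drop idle days, keep per-day dates."""
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith(" 0"):
            continue  # same as: sed '/ 0$/d'
        # same as: sed 's/^\(........\)....\? /\1 /'
        yield re.sub(r"^(.{8}).{3,4} ", r"\1 ", line)


# Made-up input: timestamp (YYYYMMDDHHMM) followed by two counters.
sample = ["201606011230 5 7", "201606020000 3 0", "201606031000 4 10"]
print(list(to_daily(sample)))
```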
I used a different gnuplot script to generate the activity graph:
#!/usr/bin/gnuplot
set title "Workrave"
set ylabel "Active hours per day"
set timefmt "%Y%m%d%H%M"
#set xrange [450000000:*]
set format x "%Y-%m-%d"
set xtics rotate
set xdata time
set terminal svg
set output "workrave.svg"
plot "workrave-angela.dat" using 1:($23/3600) title "angela", \
"workrave-marcos.dat" using 1:($23/3600) title "marcos", \
"workrave-koumbit.dat" using 1:($23/3600) title "koumbit", \
"workrave.dat" using 1:($23/3600) title "average" smooth sbezier linewidth 3
#plot "workrave-angela.dat" using 1:28 smooth sbezier title "angela", \
# "workrave-marcos.dat" using 1:28 smooth sbezier title "marcos", \
# "workrave-koumbit.dat" using 1:28 smooth sbezier title "koumbit"